DETAILED ACTION
Response to Amendment 

1. The amendments to the claims submitted December 13, 2004 have been
entered. Claims 1, 5-6, and 10-12 are currently amended, and new claim 13 has been 
added. 

Response to Arguments 

2. Applicant's arguments, see page 8, 2nd and 3rd paragraphs, filed December 13,
2004, with respect to the rejection(s) of independent claim(s) 1 and 10-12 under 35
U.S.C. 102(b) as being anticipated by Matsui et al. (U.S. Patent 5,835,890) have been 
fully considered and are persuasive. Therefore, the rejection has been withdrawn. 
However, upon further consideration, a new ground(s) of rejection is made under 35 
U.S.C. 103(a) as being unpatentable over Matsui et al., in view of McKinley et al. (Noise
Model Adaptation in Model Based Speech Enhancement), and further in view of Pastor
(U.S. Patent 5,572,623). 

3. Furthermore, with regard to the use of official notice in the rejection of claim 6, it
is noted that the applicant has not attempted to traverse the assertion of official
notice; therefore, the statement of what is well known in the art is taken to be admitted
prior art (see MPEP 2144.03).



Application/Control Number: 09/942,896 Page 3 

Art Unit: 2655 

Claim Objections 

4. The amendments to the claims overcome the objections made in the previous 
office action. The objections to the claims are withdrawn. 

Claim Rejections - 35 USC § 102 

5. As discussed above, the rejections under 35 U.S.C. 102(b) as being anticipated 
by Matsui et al. (U.S. Patent 5,835,890) are withdrawn. However, upon further 
consideration, a new ground(s) of rejection is made under 35 U.S.C. 103(a) as being 
unpatentable over Matsui et al., in view of Pastor (U.S. Patent 5,572,623). 

Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

7. Claims 1, 4, 5, 8, and 10-12 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Matsui et al. (U.S. Patent 5,835,890), in view of McKinley et al. 
(Noise Model Adaptation in Model Based Speech Enhancement), and further in view of 
Pastor (U.S. Patent 5,572,623). 

In regard to claim 1, Matsui et al. discloses an apparatus that includes a data
extraction means (feature extracting part 11, Fig. 3) (column 5, lines 34-40), as well as a
model adaptation means (adaptation part 15). The model adaptation means adapts the
extracted data by means of the most (maximum) likelihood method (column 6, lines
10-59).

Matsui et al. does not disclose that the model is an acoustic model for ambient
noise.

McKinley et al. discloses a method of creating an acoustic model for ambient 
noise (noise model) and a method to adapt that noise model (sections 2 and 3). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Matsui et al. to create and adapt an acoustic model for ambient 
noise, as disclosed by McKinley et al., because "noise model adaptation is essential for 
proper operation", as taught by McKinley et al. (abstract and section 1). 

Neither Matsui et al. nor McKinley et al. discloses that the model adaptation means
adapts a model during a noise observation interval that ends when a speech switch is 
turned on when a user starts speech. 

Pastor discloses a method for collecting ambient noise frames for adapting an 
ambient noise model in a noisy environment, which adapts a model during a noise 
observation interval that ends when a speech switch is turned on when a user starts 
speech (microphone switching indicates a time area close to the speech signal, column 
4, lines 48-54; a first voiced frame is searched for in the vicinity of the switch time, 
column 4, lines 60-68; when found, the N2 frames which precede the voiced frame are
used as noise frames, column 4, line 66 to column 5, line 3; which is used to adapt 
noise models, column 5, lines 19-29). 

It would have been obvious to one of ordinary skill in the art at the time of
invention to further modify the combination of Matsui and McKinley et al. to adapt the 
noise model during an observation interval that ends when a speech switch was turned 
on when a user started speech, since the noise frames immediately preceding the 
speaking interval would be the most similar to noise that occurred during the speaking 
interval, and would provide the best noise models for improving subsequent recognition.
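
By way of illustration only, the frame selection attributed to Pastor above (search for the first voiced frame near the switch time, then take the N2 frames that precede it as noise frames) might be sketched as follows. This is not code from any cited reference. The per-frame voicing flags are assumed to be supplied by some detector that is not reproduced here, and the names `first_voiced_frame`, `noise_frames_before_speech`, and `search_radius` are hypothetical.

```python
from typing import List, Optional, Sequence


def first_voiced_frame(voiced: Sequence[bool], switch_frame: int,
                       search_radius: int) -> Optional[int]:
    """Return the index of the first voiced frame near the switch time, or None."""
    # Assumption: "in the vicinity of the switch time" is modeled as a
    # symmetric window of search_radius frames around switch_frame.
    start = max(0, switch_frame - search_radius)
    end = min(len(voiced), switch_frame + search_radius + 1)
    for i in range(start, end):
        if voiced[i]:
            return i
    return None


def noise_frames_before_speech(frames: Sequence[float], voiced: Sequence[bool],
                               switch_frame: int, n2: int,
                               search_radius: int = 5) -> List[float]:
    """Return up to n2 frames immediately preceding the first voiced frame."""
    onset = first_voiced_frame(voiced, switch_frame, search_radius)
    if onset is None:
        # No voiced frame found near the switch: no noise frames are selected.
        return []
    return list(frames[max(0, onset - n2):onset])
```

The frames returned are the ones closest in time to the speech onset, which is the property relied upon in the rationale above.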

In regard to claim 4, Matsui et al. discloses that the pattern recognition is 
performed on the basis of feature distribution in a feature space (recognition result step 
S5, column 7, lines 22-35). The model adaptation means (adaptation part 15) adapts 
the model using the feature distribution obtained from the extracted data (column 6, 
lines 10-21). 

In regard to claim 5, Matsui et al. discloses that a measure indicating the degree 
to which the extracted data is observed in the predetermined model becomes 
maximum, by means of the most (maximum) likelihood method (Equation 4, column 6, 
lines 10-59). 
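
For illustration, the idea of choosing model parameters so that the likelihood of the extracted data becomes maximum can be sketched for a single one-dimensional Gaussian. This is an assumed, simplified stand-in and does not reproduce Equation 4 of Matsui et al.; the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_gaussian_ml(samples: Sequence[float]) -> Tuple[float, float]:
    """Maximum likelihood mean and variance of a 1-D Gaussian."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    mean = sum(samples) / n
    # The ML variance divides by n (not n - 1).
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var


def log_likelihood(samples: Sequence[float], mean: float, var: float) -> float:
    """Log-likelihood of the samples under a Gaussian with the given parameters."""
    return sum(-0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
               for x in samples)
```

Any other choice of mean or variance yields a lower log-likelihood for the same samples, which is the sense in which the measure "becomes maximum".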

In regard to claim 6, Matsui et al. discloses a model adaptation means that 
adapts a model so that a measure indicating the degree to which the extracted data is 
observed in a predetermined model becomes maximum by means of the maximum 
(most) likelihood method. Additionally, Matsui et al. discloses that the equation that 
calculates a measure indicating the degree to which the observed data is observed in a
predetermined model requires the model parameters of the predetermined models
(Equation 4, θ; see column 1, lines 51-67 and column 2, lines 1-7 for a definition of θ).

The combination of Matsui et al., McKinley et al., and Pastor, as applied to claim
1, above, does not disclose that the model adaptation means determines a parameter of
the predetermined model, which gives a maximum value based on the maximum (most)
likelihood method, by means of the Newton descent method or the Monte Carlo method.

The examiner takes official notice that it is well known in the art to use the Monte 
Carlo method to estimate statistical parameters in a Gaussian distribution. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Matsui et al. so the statistical parameters of a predetermined model 
were estimated by means of the Monte Carlo method, so the calculations could be done 
more quickly than calculating the exact statistical parameters of a predetermined model. 
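
As a rough illustration of what estimating a Gaussian parameter by a Monte Carlo method could look like, the sketch below draws random candidate means and keeps the one with the highest likelihood. It is an assumption made for illustration and is not taken from any reference of record; the function names, the uniform sampling range, and the fixed seed are hypothetical choices.

```python
import math
import random
from typing import Sequence


def gaussian_log_likelihood(samples: Sequence[float], mean: float, var: float) -> float:
    """Log-likelihood of the samples under a 1-D Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
               for x in samples)


def monte_carlo_mean(samples: Sequence[float], var: float,
                     n_candidates: int = 2000, seed: int = 0) -> float:
    """Approximate the maximum likelihood mean by random search."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    lo, hi = min(samples), max(samples)
    best_mean = lo
    best_ll = gaussian_log_likelihood(samples, lo, var)
    for _ in range(n_candidates):
        # Draw a candidate mean uniformly over the range of the data.
        candidate = rng.uniform(lo, hi)
        ll = gaussian_log_likelihood(samples, candidate, var)
        if ll > best_ll:
            best_mean, best_ll = candidate, ll
    return best_mean
```

The result is an approximation whose accuracy depends on the number of candidates drawn.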

In regard to claim 8, Matsui et al. discloses that the input data is voice (speech) 
data (column 5, lines 34-40).

In regard to claim 9, Matsui does not disclose that the predetermined model is an 
acoustic model representing input data during an interval which is not a voice interval. 
An interval which is not a voice interval has been interpreted herein as a "no speech" 
interval. 

McKinley et al. discloses a method of creating an acoustic model representing
input data during an interval which is not a voice interval (noise model) and a method to 
adapt that noise model (sections 2 and 3). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Matsui et al., McKinley et al., and Pastor, 
as applied to claim 1, above, to create an acoustic model representing input data during
an interval which was not a voice interval, as disclosed by McKinley et al., because 
"noise model adaptation is essential for proper operation", as taught by McKinley et al. 
(abstract and section 1). 

In regard to claim 10, Matsui et al. discloses a method of adapting a model used 
in pattern recognition which includes the steps of: 

Extracting input data corresponding to a predetermined model, observed during a 
predetermined interval, and then outputting the extracted data (Fig. 2, step S1, column 
5, lines 34-40). 

And adapting the predetermined model extracted during the predetermined 
interval by means of the most (maximum) likelihood method (Fig. 2, step S3, column 6, 
lines 10-59). 

Matsui et al. does not disclose that the model is an acoustic model for ambient
noise.

McKinley et al. discloses a method of creating an acoustic model for ambient 
noise (noise model) and a method to adapt that noise model (sections 2 and 3). 

It would have been obvious to one of ordinary skill in the art at the time of
invention to modify Matsui et al. to create and adapt an acoustic model for ambient
noise, as disclosed by McKinley et al., because "noise model adaptation is essential for
proper operation", as taught by McKinley et al. (abstract and section 1). 

Neither Matsui et al. nor McKinley et al. discloses that the model adaptation means
adapts a model during a noise observation interval that ends when a speech switch is 
turned on when a user starts speech. 

Pastor discloses a method for collecting ambient noise frames for adapting an 
ambient noise model in a noisy environment, which adapts a model during a noise 
observation interval that ends when a speech switch is turned on when a user starts 
speech (microphone switching indicates a time area close to the speech signal, column 
4, lines 48-54; a first voiced frame is searched for in the vicinity of the switch time, 
column 4, lines 60-68; when found, the N2 frames which precede the voiced frame are
used as noise frames, column 4, line 66 to column 5, line 3; which is used to adapt 
noise models, column 5, lines 19-29). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Matsui and McKinley et al. to adapt the 
noise model during an observation interval that ends when a speech switch was turned 
on when a user started speech, since the noise frames immediately preceding the 
speaking interval would be the most similar to noise that occurred during the speaking 
interval, and would provide the best noise models for improving subsequent recognition. 

In regard to claim 11, Matsui et al. discloses a storage medium (Fig. 3, memory
10M) which stores a program for executing using a computer (control part 10) 
adaptation of a model used in pattern recognition (column 5, lines 20-33). The program 
comprises the steps of: 

Extracting input data corresponding to a predetermined model, observed during a 
predetermined interval, and then outputting the extracted data (step S1, column 5, lines 
34-40). 

And adapting the predetermined model extracted during the predetermined 
interval by means of the most (maximum) likelihood method (step S3, column 6, lines 
10-59). 

Matsui et al. does not disclose that the model is an acoustic model for ambient
noise.

McKinley et al. discloses a method of creating an acoustic model for ambient 
noise (noise model) and a method to adapt that noise model (sections 2 and 3). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Matsui et al. to create and adapt an acoustic model for ambient 
noise, as disclosed by McKinley et al., because "noise model adaptation is essential for 
proper operation", as taught by McKinley et al. (abstract and section 1). 

Neither Matsui et al. nor McKinley et al. discloses that the model adaptation means
adapts a model during a noise observation interval that ends when a speech switch is 
turned on when a user starts speech. 

Pastor discloses a method for collecting ambient noise frames for adapting an
ambient noise model in a noisy environment, which adapts a model during a noise 
observation interval that ends when a speech switch is turned on when a user starts 
speech (microphone switching indicates a time area close to the speech signal, column 
4, lines 48-54; a first voiced frame is searched for in the vicinity of the switch time, 
column 4, lines 60-68; when found, the N2 frames which precede the voiced frame are
used as noise frames, column 4, line 66 to column 5, line 3; which is used to adapt 
noise models, column 5, lines 19-29). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Matsui and McKinley et al. to adapt the 
noise model during an observation interval that ends when a speech switch was turned 
on when a user started speech, since the noise frames immediately preceding the 
speaking interval would be the most similar to noise that occurred during the speaking 
interval, and would provide the best noise models for improving subsequent recognition. 

In regard to claim 12, Matsui et al. discloses an apparatus for classifying input 
data in the form of a time series into one of a predetermined number of models. The 
apparatus includes: 

A feature extraction means for extracting a feature value of input data and a data 
extraction means for extracting input data corresponding to a predetermined model 
observed during a predetermined interval that outputs the extracted data (feature 
parameter extracting part, Fig. 3, 11, extracts a feature value of the input data and
extracts input data corresponding to a predetermined model observed during a
predetermined interval, column 5, lines 34-48). 

A storage means for storing a predetermined number of models (12). 

A classifying means for classifying the feature value of the input data into one of 
said predetermined number of models (model sequence selecting part 13 selects models
that are most closely matched to the input feature parameter sequence, column 5, lines 
41-67 and column 6, lines 1-9). 

And a model adaptation means for adapting the predetermined model using data 
extracted during the predetermined interval by means of the most (maximum) likelihood 
method (adaptation part 15, column 6, lines 10-59). 

Matsui et al. does not disclose that the model is an acoustic model for ambient
noise.

McKinley et al. discloses a method of creating an acoustic model for ambient 
noise (noise model) and a method to adapt that noise model (sections 2 and 3). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Matsui et al. to create and adapt an acoustic model for ambient 
noise, as disclosed by McKinley et al., because "noise model adaptation is essential for 
proper operation", as taught by McKinley et al. (abstract and section 1).

Neither Matsui et al. nor McKinley et al. discloses that the model adaptation means
adapts a model during a noise observation interval that ends when a speech switch is 
turned on when a user starts speech. 

Pastor discloses a method for collecting ambient noise frames for adapting an
ambient noise model in a noisy environment, which adapts a model during a noise 
observation interval that ends when a speech switch is turned on when a user starts 
speech (microphone switching indicates a time area close to the speech signal, column 
4, lines 48-54; a first voiced frame is searched for in the vicinity of the switch time, 
column 4, lines 60-68; when found, the N2 frames which precede the voiced frame are
used as noise frames, column 4, line 66 to column 5, line 3; which is used to adapt 
noise models, column 5, lines 19-29). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Matsui and McKinley et al. to adapt the 
noise model during an observation interval that ends when a speech switch was turned 
on when a user started speech, since the noise frames immediately preceding the 
speaking interval would be the most similar to noise that occurred during the speaking 
interval, and would provide the best noise models for improving subsequent recognition. 

8. Claims 2 and 3 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Matsui et al., in view of McKinley et al., further in view of Pastor, and further in view of
Rao et al. (U.S. Patent 5,978,760). 

The combination of Matsui et al., McKinley et al., and Pastor, as applied to claim
1, above, does not disclose that the model adaptation means adapts the model by using
a freshness degree indicating the freshness of the extracted data. Matsui et al.,
McKinley et al., and Pastor also do not disclose that the freshness degree is a function
which varies depending on the temporal position of the extracted data.

Rao et al. discloses a noise parameter generator (Fig. 4, 40) that creates a noise 
model (noise parameters) that includes a freshness degree (exponential weighting 
function) that indicates the freshness of the extracted data (column 4, lines 4-43). The 
freshness degree (exponential weight) varies depending on the temporal position of the 
extracted data (column 3, lines 43-51). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Matsui et al., McKinley et al., and Pastor 
so the model adaptation means also included a freshness degree to indicate the 
freshness of the extracted data, wherein the freshness degree was a function that 
varied depending on the temporal position of the extracted data, as disclosed by Rao et 
al., so that the model adaptation means would create a noise model that
represented the actual background noise better than a noise model without a weighting,
as taught by Rao et al. (column 3, lines 9-11).
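
As an illustrative sketch only, a freshness degree in the form of an exponential weighting function can be modeled as a weight that decays with the age of each frame, so that recent frames contribute more to the noise estimate. The code below is an assumption made for illustration and does not reproduce the noise parameter generator of Rao et al.; the names `freshness_weights`, `weighted_noise_mean`, and `decay` are hypothetical.

```python
from typing import List, Sequence


def freshness_weights(n_frames: int, decay: float) -> List[float]:
    """Exponential weights: the newest frame gets 1.0, older frames get less."""
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    # Frame t = 0 is the oldest; its age is n_frames - 1 - t.
    return [decay ** (n_frames - 1 - t) for t in range(n_frames)]


def weighted_noise_mean(frames: Sequence[float], decay: float) -> float:
    """Noise level estimate in which fresher frames count more."""
    if not frames:
        raise ValueError("at least one frame is required")
    weights = freshness_weights(len(frames), decay)
    return sum(w * x for w, x in zip(weights, frames)) / sum(weights)
```

With `decay` equal to 1.0 the estimate reduces to the unweighted mean, which corresponds to a noise model without a weighting.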

9. Claim 7 is rejected under 35 U.S.C. 103(a) as being unpatentable over Matsui et
al., in view of McKinley et al., further in view of Pastor, and further in view of Komori et
al. (U.S. Patent 6,108,628).

Matsui et al. discloses that the model adaptation means adapts a model so that a 
measure indicating the degree to which the extracted data is observed in a 
predetermined model becomes maximum by means of the most (maximum) likelihood
method. 

The combination of Matsui et al., McKinley et al., and Pastor, as applied to claim 
1, above, does not disclose that the model adaptation means adapts a model so that a 
measure indicating the degree to which the extracted data is observed in a 
predetermined model becomes maximum or minimum by means of the minimum 
distance-maximum separation theorem, or that the measure is defined using a 
Bhattacharyya distance. 

Komori et al. discloses that a measure indicating the degree to which extracted 
data is observed in a predetermined model becomes minimum by means of the 
minimum distance theorem (column 4, lines 24-55). The measure is defined using a 
Bhattacharyya distance (Equation 2). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to alternatively further modify the combination of Matsui et al., McKinley et al., 
and Pastor, so models were represented in a feature space and so the degree to which 
extracted data was observed in a predetermined model was determined by the 
minimum distance theorem with the distance measure defined as a Bhattacharyya 
distance, as taught by Komori et al., in order to measure the similarity between two 
models in a more detailed manner by replacing the phoneme model HMM of Matsui et 
al. with a more detailed HMM dependent on the phoneme environment, as taught by 
Komori et al. (column 3, lines 56-67 and column 4, lines 1-7). 
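
For illustration, the standard closed form of the Bhattacharyya distance between two one-dimensional Gaussians, and its use for picking the closest model by minimum distance, can be sketched as follows. This is the textbook formula and is offered as an assumption; it is not a reproduction of Equation 2 of Komori et al., and the function names are hypothetical.

```python
import math
from typing import Dict, Tuple


def bhattacharyya_distance(mean1: float, var1: float,
                           mean2: float, var2: float) -> float:
    """Bhattacharyya distance between two 1-D Gaussians."""
    if var1 <= 0.0 or var2 <= 0.0:
        raise ValueError("variances must be positive")
    var_sum = var1 + var2
    # Term driven by the separation of the means.
    mean_term = 0.25 * (mean1 - mean2) ** 2 / var_sum
    # Term driven by the mismatch of the variances.
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def closest_model(target: Tuple[float, float],
                  models: Dict[str, Tuple[float, float]]) -> str:
    """Name of the model with minimum Bhattacharyya distance to the target."""
    return min(models,
               key=lambda name: bhattacharyya_distance(*target, *models[name]))
```

The distance is zero for identical distributions and grows as either the means or the variances diverge.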

Allowable Subject Matter

10. Claim 13 is objected to as being dependent upon a rejected base claim, but 
would be allowable if rewritten in independent form including all of the limitations of the 
base claim and any intervening claims. 

The following is a statement of reasons for the indication of allowable subject 
matter: the prior art of record does not disclose and would not suggest to one of 
ordinary skill in the art at the time of invention a noise observation interval which is split 
into two sub-intervals wherein the acoustic model is adapted during a second sub- 
interval. The closest prior art of record teaches collecting and adapting a noise model 
simultaneously, not performing the data extraction and adapting at two separate time
periods (sub-intervals). 

Conclusion 

11. The prior art made of record and not relied upon is considered pertinent to
applicant's disclosure. Yamaguchi et al. (U.S. Patent 6,026,359) and Chiang (U.S.
Patent 6,188,982) disclose additional methods for noise model adaptation. Gong (U.S.
Patent 6,418,411) discloses a method that adapts a noise model during an interval
between the time a speech switch is activated and an audible sound that indicates to
the user to begin speaking.

12. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

13. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Brian L Albertalli whose telephone number is (571) 272- 
7616. The examiner can normally be reached on Mon - Fri, 8:00 AM - 5:30 PM, every 
second Fri off. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Ometz, can be reached on (571) 272-7593. The fax phone number for
the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

BLA 5/4/2005 




